Poverty Data Update

Analysis

Excel workbook and PowerPoint templates populated with updated poverty and housing insecurity data, prepared by Catalyst California for the First 5 LA Office of Data for Action. This work is part of Catalyst California’s ongoing strategic partnership and technical support for First 5 LA departments.

Catalyst California catalystcalifornia.org
2025-03-21

The I&A team at First 5 LA asked Catalyst California for updated data on poverty among households with children under five in Los Angeles County. The data are broken out by racial-ethnic group, Service Planning Area (SPA), and county for 2022 and 2023. For this data update, Catalyst California analyzed poverty at both the 100% FPL level and the 138% FPL level, for both years. Estimates cut by race use the 2022 and 2023 5-year estimates; estimates at the SPA and county levels use the 2022 and 2023 1-year estimates.
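As a rough sketch of how the two thresholds translate to the ACS PUMS, the income-to-poverty ratio variable POVPIP (a percentage, topcoded at 501) can be flagged at each cutoff. Whether each cutoff is strict or inclusive is an assumption here, not taken from the workbook:

```r
# Sketch (assumption): flagging the two poverty thresholds from the ACS PUMS
# POVPIP variable, the income-to-poverty ratio expressed as a percentage
# (topcoded at 501). The exact inequality used at each cutoff is a choice;
# the analysis's rule is not shown here.
povpip <- c(45, 100, 120, 138, 250, 501)  # toy values
below_100_fpl <- povpip < 100         # below 100% FPL
at_or_below_138_fpl <- povpip <= 138  # at or below 138% FPL
```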

Link to the workbook

Catalyst California also created a PUMA to SPA crosswalk map to visualize which PUMAs across Los Angeles County fall within each of the eight SPAs. The code for creating this crosswalk and the map visualization can be found further below on this page.

Map of Public Use Microdata Areas (PUMAs), LA County Service Planning Areas (SPAs), and Crosswalk

Data on households with children 0-5 in poverty are available by PUMA, but the request is for that data by SPA. We created a PUMA-SPA crosswalk to facilitate calculating numbers and percentages of households with children 0-5 in poverty by SPA.

There are 71 PUMAs in LA County, shown with blue borders; their identification numbers run from 03703 to 03782, with the 037 prefix identifying LA County. These numbers appear in the initial hover labels on the map.

There are 8 SPAs in LA County, shown with black borders and numbered one through eight. You can see these labels by turning off (unchecking) the PUMA layer. Note: SPA 7 has a cutout of SPA 8 in the Signal Hill area.

The crosswalk of PUMAs to SPAs we use to attribute household 0-5 poverty data is shown by the colored areas, colored by SPA. The areas without color are the smaller portions of PUMAs that overlap multiple SPAs. For example, the non-colored portion of PUMA 03709 (at the center of the map in La Canada Flintridge) is the portion of PUMA 03709 that falls in SPA 2. The vast majority of PUMA 03709 falls in SPA 3, so it is attributed to SPA 3 and not SPA 2. We don’t attribute fractions of PUMAs to SPAs because the potential increase in accuracy is outweighed by a decrease in statistical stability.
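The whole-PUMA attribution rule can be sketched as follows. The `overlaps` table and its area values are hypothetical stand-ins for the PUMA-SPA intersection areas (e.g., what `sf::st_intersection()` on the two shapefiles would produce); each PUMA is assigned to the single SPA containing the largest share of its area:

```r
# Sketch: attribute each PUMA to the SPA containing the largest share of its
# area. `overlaps` is a hypothetical stand-in for a table of PUMA-SPA
# intersection areas.
overlaps <- data.frame(
  puma = c("03709", "03709", "03710"),
  spa  = c("2", "3", "2"),
  area = c(10, 90, 100)  # toy intersection areas
)
# For each PUMA, keep the SPA row with the maximum overlapping area
crosswalk <- do.call(rbind, lapply(split(overlaps, overlaps$puma), function(d) {
  d[which.max(d$area), c("puma", "spa")]
}))
```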

Below is the code for visualizing the PUMAs to SPAs crosswalk:

Reading layer `Service_Planning_Areas__2022_' from data source 
  `W:\Project\RDA Team\First5LA\Ad-hoc Research\Laura ask poverty update\Shapes\Service_Planning_Areas__2022_.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 8 features and 8 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 6275488 ymin: 1386069 xmax: 6668481 ymax: 2122085
Projected CRS: NAD83 / California zone 5 (ftUS)

Functions for analyzing PUMS data

The code below shows the functions Catalyst California created to assist in analyzing PUMS data for the I&A team. Specifically, the code defines three functions:

  1. Race recoding - recode racial-ethnic groups including SWANA
  2. Survey return - percent + moe, cv, etc. calculations for total populations
  3. Survey return race - percent + moe, cv, etc. calculations for recoded racial-ethnic groups
# Functions for F5LA Laura ask
# 1) race recode - recode racial-ethnic groups including SWANA
# 2) survey return - percent + moe, cv, etc. calculations for total populations
# 3) survey return race - percent + moe, cv, etc. calculations for recoded racial-ethnic groups


#### Function to recode racial-ethnic groups ####

library(tidyverse)
library(data.table)
library(readxl)

race_recode <- function(people) {

people_recoded <- people %>%
  mutate(latino = ifelse(HISP %in% "01", "Not Latinx", "Latinx"))

people_recoded <- people_recoded %>%
  mutate(aian = ifelse(RACAIAN %in% "0", "Not AIAN", "AIAN Alone or in Combination"))

people_recoded <- people_recoded %>%
  mutate(nhpi = ifelse(RACPI %in% "1", "NHPI Alone or in Combination",
                ifelse(RACNH %in% "1", "NHPI Alone or in Combination", "Not NHPI")))

# Set up codes for SWANA
# First use ancestry codes to help identify estimated swana pop
## Create list of swana codes for PUMS
pums_swana_list <- list("Algerian","Arab","Assyrian","Bahraini","Berber","Chaldean","Egyptian","Emirati","Iranian","Iraqi","Israeli","Jordanian","Kurdish","Kuwaiti","Lebanese","Libyan","Middle Eastern","Moroccan","North African","Omani","Palestinian","Qatari","Saudi","Syriac","Syrian","Tunisian","Yazidi","Yemeni","Mideast","Saudi Arabian","Arabic","Other Arab","Libyan (2017 or later)","Kuwaiti (2017 or later)","Turkish","Sudanese","Afghan") # "2017 or later" variants needed based on review of the data dictionary, which lists separate fields for Arabic and Other Arab

## import PUMS codes from data dictionary previously downloaded to this directory from
# https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2018-2022.csv
# without the name column and remaining columns renamed to Variable, C, Length, Code_1, Code2, Description
pums_codes <- read_excel("PUMS_Data_Dictionary_2018-2022_ANC1P.xlsx") %>%
  mutate_all(as.character) # create this excel document separately by opening the PUMS Data Dictionary in Excel and deleting everything but the ancestry fields -- since ANC1P and ANC2P have the same data values, there is no need to do both
pums_codes <- pums_codes %>% dplyr::rename("ANC_Code" = "Code_1")

## filter PUMS codes for swana descriptions based on our swana_list
swana_codes<-pums_codes%>%filter(Description %in% pums_swana_list)

##swana - alone or in combination with another race or latino
people_recoded$swana <- "Not SWANA"
people_recoded$swana[people_recoded$ANC1P%in% swana_codes$ANC_Code| people_recoded$ANC2P%in% swana_codes$ANC_Code] <- "SWANA"
people_recoded$swana <- as.factor(people_recoded$swana)

# code other race groups
people_recoded <- people_recoded %>%
  mutate(race = as.factor(case_when(
    RAC1P %in% c("3", "4", "5") ~ "AIAN placeholder",   # AIAN codes, regardless of Latinx status
    RAC1P == "7" ~ "NHPI placeholder",                  # NHPI code, regardless of Latinx status
    latino == "Latinx" ~ "Latinx placeholder",          # all remaining Latinx records
    RAC1P == "1" ~ "White NL",
    RAC1P == "2" ~ "Black NL",
    RAC1P == "6" ~ "Asian NL",
    RAC1P == "8" ~ "Other NL",
    RAC1P == "9" ~ "Two or More NL")))

return(people_recoded)
}
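The precedence in the race recode matters: AIAN (codes 3-5) and NHPI (code 7) are assigned regardless of Latinx status, while every other Latinx record collapses into a single placeholder that is later filtered out. A toy sketch of that same ordering, in Python purely for illustration (not part of the pipeline):

```python
def recode_race(rac1p, latino):
    """Mirror the precedence of the race recode above (illustrative only)."""
    if rac1p in ("3", "4", "5"):   # AIAN codes win, regardless of Latinx status
        return "AIAN placeholder"
    if rac1p == "7":               # NHPI code wins, regardless of Latinx status
        return "NHPI placeholder"
    if latino == "Latinx":         # every remaining Latinx record
        return "Latinx placeholder"
    return {"1": "White NL", "2": "Black NL", "6": "Asian NL",
            "8": "Other NL", "9": "Two or More NL"}[rac1p]

print(recode_race("3", "Latinx"))   # AIAN takes precedence over Latinx status
print(recode_race("2", "Latinx"))   # collapses to the Latinx placeholder
```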




#### Function to set up and run survey ####

survey_return <- function(hh) {

  # survey design code
  
  # Define the weight variable used in the survey design setup
  ## Use WGTP for housing-unit estimates (psam_h06.csv) or PWGTP for person-level estimates (psam_p06.csv)
  weight <- 'WGTP' # using WGTP b/c calculating percentages of households with children under 5
  
  repwlist <- paste0("WGTP", 1:80) # replicate weight column names WGTP1-WGTP80
  
  # create survey design
  
  hh_geo <- hh %>%               
    as_survey_rep(
      variables = c(geoid, indicator),   # dplyr::select grouping variables
      weights = weight,                       #  weight
      repweights = repwlist,                  # list of replicate weights
      combined_weights = TRUE,                # tells the function that replicate weights are included in the data
      mse = TRUE,                             # tells the function to calc mse
      type="other",                           # statistical method
      scale=4/80,                             # scaling set by ACS
      rscale=rep(1,80)                        # setting specific to ACS-scaling
    )
  
  ###### TOTAL ######
  total <- hh_geo  %>%
    group_by(geoid,indicator) %>%   # group by geography and indicator
    summarise(
      num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
      rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
    left_join(hh_geo %>%                                        # left join in the denominators
                group_by(geoid) %>%                                     # group by geo
                summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
    mutate(rate=rate*100,
           rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
           rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
           count_moe = num_se*1.645, # calculate moe for numerator count based on se
           count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
  
  return(total)
}
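The MOE and CV arithmetic repeated in each `summarise()` block reduces to two small formulas: MOE = SE × 1.645 (the 90%-confidence z-score, the ACS convention) and CV = (SE / estimate) × 100. A quick arithmetic check, in Python with toy numbers rather than real survey output:

```python
Z90 = 1.645  # 90%-confidence z-score used for ACS margins of error

def moe_cv(estimate, se):
    """Margin of error and coefficient of variation (%) for a survey estimate."""
    moe = se * Z90
    cv = (se / estimate) * 100  # equivalent to ((moe / 1.645) / estimate) * 100
    return moe, cv

moe, cv = moe_cv(estimate=40.0, se=2.0)  # e.g. a 40% rate with a 2-point SE
print(moe, cv)  # → 3.29 5.0
```

Dividing the MOE back by 1.645, as the code does, just recovers the SE before forming the CV.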




#### Function to set up and run survey for racial-ethnic groups  ####

survey_return_race <- function(hh) {
  
# survey design code

# Define the weight variable used in the survey design setup
## Use WGTP for housing-unit estimates (psam_h06.csv) or PWGTP for person-level estimates (psam_p06.csv)
weight <- 'WGTP' # using WGTP b/c calculating percentages of households with children under 5

repwlist <- paste0("WGTP", 1:80) # replicate weight column names WGTP1-WGTP80

# create survey design

hh_geo <- hh %>%               
  as_survey_rep(
    variables = c(geoid, indicator, race, latino, aian, nhpi, swana),   # dplyr::select grouping variables
    weights = weight,                       #  weight
    repweights = repwlist,                  # list of replicate weights
    combined_weights = TRUE,                # tells the function that replicate weights are included in the data
    mse = TRUE,                             # tells the function to calc mse
    type="other",                           # statistical method
    scale=4/80,                             # scaling set by ACS
    rscale=rep(1,80)                        # setting specific to ACS-scaling
  )

###### Latino ######
lat <- hh_geo  %>%
  group_by(geoid,latino,indicator) %>%   # group by race cat
  summarise(
    num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
    rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
  left_join(hh_geo %>%                                        # left join in the denominators
              group_by(geoid,latino) %>%                                     # group by geo
              summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
  mutate(rate=rate*100,
         rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
         rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
         count_moe = num_se*1.645, # calculate moe for numerator count based on se
         count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count


###### NHPI ######
nhpi <- hh_geo  %>%
  group_by(geoid,nhpi,indicator) %>%   # group by race cat
  summarise(
    num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
    rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
  left_join(hh_geo %>%                                        # left join in the denominators
              group_by(geoid,nhpi) %>%                                     # group by geo
              summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
  mutate(rate=rate*100,
         rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
         rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
         count_moe = num_se*1.645, # calculate moe for numerator count based on se
         count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count

###### AIAN ######
aian <- hh_geo  %>%
  group_by(geoid,aian,indicator) %>%   # group by race cat
  summarise(
    num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
    rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
  left_join(hh_geo %>%                                        # left join in the denominators
              group_by(geoid,aian) %>%                                     # group by geo
              summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
  mutate(rate=rate*100,
         rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
         rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
         count_moe = num_se*1.645, # calculate moe for numerator count based on se
         count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count

###### SWANA ######
swana <- hh_geo  %>%
  group_by(geoid,swana,indicator) %>%   # group by race cat
  summarise(
    num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
    rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
  left_join(hh_geo %>%                                        # left join in the denominators
              group_by(geoid,swana) %>%                                     # group by geo
              summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
  mutate(rate=rate*100,
         rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
         rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
         count_moe = num_se*1.645, # calculate moe for numerator count based on se
         count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count

###### RACE ######
race <- hh_geo  %>%
  group_by(geoid,race,indicator) %>%   # group by race cat
  summarise(
    num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
    rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
  left_join(hh_geo %>%                                        # left join in the denominators
              group_by(geoid,race) %>%                                     # group by geo
              summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
  mutate(rate=rate*100,
         rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
         rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
         count_moe = num_se*1.645, # calculate moe for numerator count based on se
         count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count

###### TOTAL ######
total <- hh_geo  %>%
  group_by(geoid,indicator) %>%   # group by geography and indicator
  summarise(
    num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
    rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
  left_join(hh_geo %>%                                        # left join in the denominators
              group_by(geoid) %>%                                     # group by geo
              summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
  mutate(rate=rate*100,
         rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
         rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
         count_moe = num_se*1.645, # calculate moe for numerator count based on se
         count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count

####  Format  ####

# rename race name columns as subgroup
total$subgroup = "Total"
total <- total %>% select(geoid, subgroup, everything())

aian <- aian %>% rename(subgroup = aian)
lat <- lat %>% rename(subgroup = latino)
nhpi <- nhpi %>% rename(subgroup = nhpi)
race <- race %>% rename(subgroup = race)
swana <- swana %>% rename(subgroup = swana)

# merge total and subgroup tables, then drop complement and placeholder categories
d_long <- rbind(total, aian, lat, race, nhpi, swana) %>%
  filter(  subgroup != "Not AIAN" &
           subgroup != "Not Latinx" &
           subgroup != "Not NHPI" &
           subgroup != "Not SWANA" &
           subgroup != "AIAN placeholder" &
           subgroup != "NHPI placeholder" &
           subgroup != "Latinx placeholder")

d_long <- as.data.frame(d_long)

return(d_long)
}
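The `as_survey_rep()` settings used in both functions (`mse = TRUE`, `type = "other"`, `scale = 4/80`, `rscale = rep(1,80)`) implement the ACS successive difference replication estimator: the variance of an estimate is 4/80 times the sum of squared deviations of the 80 replicate estimates from the full-sample estimate. A minimal sketch of that formula, with made-up replicate values rather than real ACS replicate weights:

```python
def sdr_variance(full_estimate, replicate_estimates, scale=4 / 80):
    """ACS successive-difference-replication variance:
    scale * sum of squared deviations of the replicates from the full estimate."""
    return scale * sum((r - full_estimate) ** 2 for r in replicate_estimates)

# toy example: a full-sample estimate of 100 whose 80 replicate estimates
# each deviate by exactly +/- 1 (illustrative numbers only)
reps = [100 + (1 if i % 2 == 0 else -1) for i in range(80)]
variance = sdr_variance(100.0, reps)
print(variance, variance ** 0.5)  # → 4.0 2.0
```

The square root of that variance is the standard error that feeds the MOE and CV calculations above.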

Poverty Analysis Code

Below is the code for each of the poverty cuts analyzed by Catalyst California.

Poverty by SPA 2023: 138% FPL

# Households with children under 5 with incomes at or below 138% of the federal poverty level by SPA
# as a percentage of all eligible LA County households with children under 5, 2023
 
# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2019-2023.pdf

# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# library(sf)
# 
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1: load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# 
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# 
# people <- fread(paste0(root, "CA_2023/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2023/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA")))
# 
# 
# ####  Step 2: filter households eligible for calculation  #### 
# 
# ## Select LA County people
# eligible_hhs <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
# 
#     # join their housing info  
#     left_join(housing %>% filter(grepl('037', PUMA)), 
#               by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for households with a child under 5 (AGEP < 5) and keep one record per household
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# #### Step 3: get PUMA SPA xwalk and join ####
# 
# #get puma-spa xwalk (copy saved to postgres database)
# conn <- connect_to_db("f5la_v2")
# puma_spa_xwalk <- st_read(conn, query = "SELECT * FROM puma_spa_xwalk_2022")
# dbDisconnect(conn)
# 
# # join
# hh <- eligible_hhs %>% left_join(puma_spa_xwalk, by = "PUMA", relationship = "many-to-many")
# 
# 
# #### Step 4 run survey function and format
# 
# # prep data and define indicator
# hh$geoid <- hh$SPA
# hh<-hh%>%
#   mutate(indicator=(ifelse(hh$POVPIP <= 138, "at or below 138% of the federal poverty level", "above 138% of the federal poverty level")))
# 
# source("ask_functions.R")
# total <- survey_return(hh)
# 
# # select burdened and not NA
# d_long <- total %>% filter(indicator == "at or below 138% of the federal poverty level" & !is.na(geoid))
# 
# # make data frame
# d_long <- as.data.frame(d_long)
# 
# # #write to one bigger csv with all data years
# write.csv(d_long, file = "CSVs/poverty_spa_2023_138pct.csv")

Poverty by SPA 2022: 138% FPL

# Households with children under 5 with incomes at or below 138% of the federal poverty level by SPA
# as a percentage of all eligible LA County households with children under 5, 2022
 
# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2022.pdf

# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# library(sf)
# 
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1: load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# people <- fread(paste0(root, "CA_2022/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2022/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA")))
# 
# 
# ####  Step 2: filter households eligible for calculation  #### 
# 
# ## Select LA County people
# eligible_hhs <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
# 
#     # join their housing info  
#     left_join(housing %>% filter(grepl('037', PUMA)), 
#               by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for households with a child under 5 (AGEP < 5) and keep one record per household
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# #### Step 3: get PUMA SPA xwalk and join ####
# 
# #get puma-spa xwalk
# conn <- connect_to_db("f5la_v2")
# puma_spa_xwalk <- st_read(conn, query = "SELECT * FROM puma_spa_xwalk_2022")
# dbDisconnect(conn)
# 
# # join
# hh <- eligible_hhs %>% left_join(puma_spa_xwalk, by = "PUMA", relationship = "many-to-many")
# 
# 
# #### Step 4 run survey function and format
# 
# # prep data and define indicator
# hh$geoid <- hh$SPA
# hh<-hh%>%
#   mutate(indicator=(ifelse(hh$POVPIP <= 138, "at or below 138% of the federal poverty level", "above 138% of the federal poverty level")))
# 
# source("ask_functions.R")
# total <- survey_return(hh)
# 
# # select burdened and not NA
# d_long <- total %>% filter(indicator == "at or below 138% of the federal poverty level" & !is.na(geoid))
# 
# # make data frame
# d_long <- as.data.frame(d_long)
# 
# # #write to one bigger csv with all data years
# write.csv(d_long, file = "CSVs/poverty_spa_2022_138pct.csv")

Poverty by Race 2023: 138% FPL

# Households with children under 5 with incomes at or below 138% of the federal poverty level by race/ethnicity
# among all eligible LA County households with children under 5, 2023
# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2019-2023.pdf

# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1 load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# people <- fread(paste0(root, "CA_2019_2023/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2019_2023/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA")))
# 
# 
# #### Step 2 filter for LA & Recode racial-ethnic groups ####
# 
# people <- people %>% filter(grepl('037', PUMA)) 
# 
# #SOURCE recode function
# source("ask_functions.R")
# people_recoded <- race_recode(people)
# 
# 
# ####  Step 3: filter LA County households eligible for poverty calculation  #### 
# 
# ## Select LA County households with income-to-poverty ratios (universe) by
# 
# eligible_hhs <- people_recoded %>%
#   
#   #filtering for poverty eligible and LA county  
#   filter(!is.na(POVPIP) & (grepl('037', PUMA))) %>%
#   
#   #join households         
#   left_join(housing, by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for households with a child under 5 (AGEP < 5) and keep one record per household
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# #### Step 4: Set up surveys and calculate percentages by race/ethnicity
# 
# # survey design code
# 
# # Define the weight variable used in the survey design setup
# ## Use WGTP for housing-unit estimates (psam_h06.csv) or PWGTP for person-level estimates (psam_p06.csv)
# weight <- 'WGTP' # using WGTP b/c calculating percentages of households with children under 5
# 
# repwlist <- paste0("WGTP", 1:80) # replicate weight column names WGTP1-WGTP80
# 
# # prep data and identify/calculate indicator
# hh <- eligible_hhs
# hh$geoid <- "037"
# 
# hh<-hh%>%
#   mutate(indicator=(ifelse(hh$POVPIP <= 138, "at or below 138% of the federal poverty level", "above 138% of the federal poverty level")))
# 
# # create survey design
# 
# hh_county <- hh %>%               
#   as_survey_rep(
#     variables = c(geoid, indicator, race, latino, aian, nhpi, swana),   # dplyr::select grouping variables
#     weights = weight,                       #  weight
#     repweights = repwlist,                  # list of replicate weights
#     combined_weights = TRUE,                # tells the function that replicate weights are included in the data
#     mse = TRUE,                             # tells the function to calc mse
#     type="other",                           # statistical method
#     scale=4/80,                             # scaling set by ACS
#     rscale=rep(1,80)                        # setting specific to ACS-scaling
#   )
# 
# 
# ###### Latino ######
# lat <- hh_county  %>%
#   group_by(geoid,latino,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,latino) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### NHPI ######
# nhpi <- hh_county  %>%
#   group_by(geoid,nhpi,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,nhpi) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### AIAN ######
# aian <- hh_county  %>%
#   group_by(geoid,aian,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,aian) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### SWANA ######
# swana <- hh_county  %>%
#   group_by(geoid,swana,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,swana) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### RACE ######
# race <- hh_county  %>%
#   group_by(geoid,race,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,race) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### TOTAL ######
# total <- hh_county  %>%
#   group_by(geoid,indicator) %>%   # group by geography and indicator
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ####  Step 5: format  ####
# 
# # rename race name columns as subgroup
# total$subgroup = "Total"
# total <- total %>% select(geoid, subgroup, everything())
# 
# aian <- aian %>% rename(subgroup = aian)
# lat <- lat %>% rename(subgroup = latino)
# nhpi <- nhpi %>% rename(subgroup = nhpi)
# race <- race %>% rename(subgroup = race)
# swana <- swana %>% rename(subgroup = swana)
# 
# # merge total and subgroup tables, keep the poverty indicator, and drop complement and placeholder categories
# d_long <- rbind(total, aian, lat, race, nhpi, swana) %>%
#   filter(indicator == "at or below 138% of the federal poverty level" & 
#            subgroup != "Not AIAN" &
#            subgroup != "Not Latinx" &
#            subgroup != "Not NHPI" &
#            subgroup != "Not SWANA" &
#            subgroup != "AIAN placeholder" &
#            subgroup != "NHPI placeholder" &
#            subgroup != "Latinx placeholder")
# 
# d_long <- as.data.frame(d_long)
# d_long$geoid <- "06037"
# d_long <- d_long %>% select(geoid, everything())
# 
# # #write to one bigger csv with all data years
# write.csv(d_long, file = "CSVs/poverty_race_2019_23_138pct.csv")

Poverty by Race 2022: 138% FPL

# Households with children under 5 with incomes at or below 138% of the federal poverty level by race/ethnicity
# among all eligible LA County households with children under 5, 2022
# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2018-2022.pdf

# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1 load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# people <- fread(paste0(root, "CA_2018_2022/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA10", "PUMA20", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2018_2022/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA10", "PUMA20")))
# 
# 
# #### Step 2 filter for LA & Recode racial-ethnic groups ####
# 
# people <- people %>% filter(grepl('037', PUMA20) | grepl('037', PUMA10)) 
# 
# #SOURCE recode function
# source("ask_functions.R")
# people_recoded <- race_recode(people)
# 
# 
# ####  Step 3: filter LA County households eligible for poverty calculation  #### 
# 
# ## Select LA County households with income-to-poverty ratios (universe) by
# 
# eligible_hhs <- people_recoded %>%
#   
#   #filtering for poverty eligible and LA county  
#   filter(!is.na(POVPIP) & (grepl('037', PUMA20) | grepl('037', PUMA10))) %>%
#   
#   #join households         
#   left_join(housing, by = c("SERIALNO", "PUMA10", "PUMA20")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for households with a child under 5 (AGEP < 5) and keep one record per household
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# #### Step 4: Set up surveys and calculate percentages by race/ethnicity
# 
# # survey design code
# 
# # Define the weight variable used in the survey design setup
# ## Use WGTP for housing-unit estimates (psam_h06.csv) or PWGTP for person-level estimates (psam_p06.csv)
# weight <- 'WGTP' # using WGTP b/c calculating percentages of households with children under 5
# 
# repwlist <- paste0("WGTP", 1:80) # replicate weight column names WGTP1-WGTP80
# 
# # prep data and identify/calculate indicator
# hh <- eligible_hhs
# hh$geoid <- "037"
# 
# hh<-hh%>%
#   mutate(indicator=(ifelse(hh$POVPIP <= 138, "at or below 138% of the federal poverty level", "above 138% of the federal poverty level")))
# 
# # create survey design
# 
# hh_county <- hh %>%               
#   as_survey_rep(
#     variables = c(geoid, indicator, race, latino, aian, nhpi, swana),   # dplyr::select grouping variables
#     weights = weight,                       #  weight
#     repweights = repwlist,                  # list of replicate weights
#     combined_weights = TRUE,                # tells the function that replicate weights are included in the data
#     mse = TRUE,                             # tells the function to calc mse
#     type="other",                           # statistical method
#     scale=4/80,                             # scaling set by ACS
#     rscale=rep(1,80)                        # setting specific to ACS-scaling
#   )
# 
# 
# ###### Latino ######
# lat <- hh_county  %>%
#   group_by(geoid,latino,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,latino) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### NHPI ######
# nhpi <- hh_county  %>%
#   group_by(geoid,nhpi,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,nhpi) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### AIAN ######
# aian <- hh_county  %>%
#   group_by(geoid,aian,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,aian) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### SWANA ######
# swana <- hh_county  %>%
#   group_by(geoid,swana,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,swana) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### RACE ######
# race <- hh_county  %>%
#   group_by(geoid,race,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,race) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### TOTAL ######
# total <- hh_county  %>%
#   group_by(geoid,indicator) %>%   # group by geo and indicator only (no race cat for the total)
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ####  Step 5: format  ####
# 
# # rename race name columns as subgroup
# total$subgroup = "Total"
# total <- total %>% select(geoid, subgroup, everything())
# 
# aian <- aian %>% rename(subgroup = aian)
# lat <- lat %>% rename(subgroup = latino)
# nhpi <- nhpi %>% rename(subgroup = nhpi)
# race <- race %>% rename(subgroup = race)
# swana <- swana %>% rename(subgroup = swana)
# 
# # merge tables except for bipoc - need total
# d_long <- rbind(total, aian, lat, race, nhpi, swana) %>%
#   filter(indicator == "at or below 138% of the federal poverty level" & 
#            subgroup != "Not AIAN" &
#            subgroup != "Not Latinx" &
#            subgroup != "Not NHPI" &
#            subgroup != "Not SWANA" &
#            subgroup != "AIAN placeholder" &
#            subgroup != "NHPI placeholder" &
#            subgroup != "Latinx placeholder")
# 
# d_long <- as.data.frame(d_long)
# d_long$geoid <- "06037"
# d_long <- d_long %>% select(geoid, everything())
# 
# # #write to one bigger csv with all data years
# write.csv(d_long, file = "CSVs/poverty_race_2018_22_138pct.csv")
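Every race/ethnicity block above repeats the same margin-of-error and coefficient-of-variation arithmetic (ACS publishes 90 percent confidence levels, hence the 1.645 z-score). A base-R sketch of that math as a standalone helper; `add_moe_cv` is a hypothetical name, not part of the scripts:

```r
# Hypothetical helper mirroring the mutate() math repeated above.
# ACS MOEs are at the 90% level, so MOE = 1.645 * SE;
# CV = SE / estimate * 100 (smaller CV = more stable estimate).
add_moe_cv <- function(d) {
  d$rate      <- d$rate * 100                           # proportion -> percent
  d$rate_moe  <- d$rate_se * 1.645 * 100                # MOE for the rate, in percentage points
  d$rate_cv   <- ((d$rate_moe / 1.645) / d$rate) * 100  # CV for the rate
  d$count_moe <- d$num_se * 1.645                       # MOE for the weighted count
  d$count_cv  <- ((d$count_moe / 1.645) / d$num) * 100  # CV for the count
  d
}

# toy check: a 25% rate with SE 0.02 gives an MOE of 3.29 points and a CV of 8
add_moe_cv(data.frame(rate = 0.25, rate_se = 0.02, num = 500, num_se = 40))
```

Dividing the MOE back by 1.645 recovers the standard error, which is why the CV lines work out to SE over estimate.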

Poverty by County 2023: 138% FPL

# Households with children under 5 with incomes at or below 138% of the federal poverty level
# as a percentage of all eligible LA County households with children under 5, 2023

# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2023.pdf

# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1: load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# people <- fread(paste0(root, "CA_2023/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2023/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA")))
# 
# 
# ####  Step 2: filter households eligible for calculation  #### 
# 
# ## Select LA County people
# eligible_hhs <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
#   
#   # join their housing info  
#   left_join(housing %>% filter(grepl('037', PUMA)), 
#             by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for children under 5 (AGEP < 5) and select distinct households
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# ####  Step 3: set up and run survey and format  #### 
# 
# # add geoid and indicator
# eligible_hhs$geoid <- "037"
# eligible_hhs$indicator=(ifelse(eligible_hhs$POVPIP <= 138, "at or below 138% of the federal poverty level", "above 138% of the federal poverty level"))
# 
# 
# #SOURCE survey return function
# source("ask_functions.R")
# total <- survey_return(eligible_hhs)
# 
# 
# # select burdened and not NA
# d_long <- total %>% filter(indicator == "at or below 138% of the federal poverty level" & !is.na(geoid))
# 
# # make data frame
# d_long <- as.data.frame(d_long)
# 
# 
# 
# 
# #### Step 4: Repeat steps 2 and 3 above without the 0-5 filter ####
# 
# ## Select LA County people
# eligible_hhs2 <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
#   
#   # join their housing info  
#   left_join(housing %>% filter(grepl('037', PUMA)), 
#             by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #select distinct households
#   distinct(SERIALNO, .keep_all = TRUE)
# 
# # add geoid and indicator
# eligible_hhs2$geoid <- "037"
# eligible_hhs2$indicator=(ifelse(eligible_hhs2$POVPIP <= 138, "at or below 138% of the federal poverty level", "above 138% of the federal poverty level"))
# 
# 
# #run survey return function
# total <- survey_return(eligible_hhs2)
# 
# # select burdened and not NA
# d_long2 <- total %>% filter(indicator == "at or below 138% of the federal poverty level" & !is.na(geoid))
# 
# # make data frame
# d_long2 <- as.data.frame(d_long2)
# 
# # bind both data frames in final
# d_final <- rbind(d_long, d_long2)
# 
# # #write to one bigger csv with all data years
# write.csv(d_final, file = "CSVs/poverty_county_2023_138pct.csv")
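The county scripts source `survey_return()` from `ask_functions.R`, which is not reproduced on this page. A plausible minimal sketch, inferred from the explicit `as_survey_rep()` design in the by-race scripts; the real helper may differ in name handling and return columns:

```r
library(dplyr)
library(srvyr)

# Hypothetical sketch of survey_return(), inferred from the explicit
# survey design used in the by-race scripts. Expects a data frame with
# geoid, indicator, WGTP, and the 80 ACS replicate weights WGTP1-WGTP80.
survey_return <- function(hh) {
  hh_svy <- hh %>%
    as_survey_rep(
      weights = WGTP,                              # housing-unit weight
      repweights = all_of(paste0("WGTP", 1:80)),   # ACS replicate weights
      combined_weights = TRUE,
      mse = TRUE,
      type = "other", scale = 4 / 80, rscale = rep(1, 80))  # ACS settings

  hh_svy %>%
    group_by(geoid, indicator) %>%
    summarise(num = survey_total(na.rm = TRUE),    # weighted count
              rate = survey_mean()) %>%            # weighted proportion
    left_join(hh_svy %>%
                group_by(geoid) %>%
                summarise(pop = survey_total(na.rm = TRUE)),  # denominator
              by = "geoid") %>%
    mutate(rate = rate * 100,
           rate_moe = rate_se * 1.645 * 100,
           rate_cv = ((rate_moe / 1.645) / rate) * 100,
           count_moe = num_se * 1.645,
           count_cv = ((count_moe / 1.645) / num) * 100)
}
```

With real ACS replicate weights the MOE and CV columns flag unstable estimates; with identical replicate weights (as in a toy input) the standard errors collapse to zero.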

Poverty by County 2022: 138% FPL

# Households with children under 5 with incomes at or below 138% of the federal poverty level
# as a percentage of all eligible LA County households with children under 5, 2022

# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2022.pdf
# 
# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1: load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# people <- fread(paste0(root, "CA_2022/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2022/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA")))
# 
# 
# ####  Step 2: filter households eligible for calculation  #### 
# 
# ## Select LA County people
# eligible_hhs <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
#   
#   # join their housing info  
#   left_join(housing %>% filter(grepl('037', PUMA)), 
#             by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for children under 5 (AGEP < 5) and select distinct households
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# ####  Step 3: set up and run survey and format  #### 
# 
# # add geoid and indicator
# eligible_hhs$geoid <- "037"
# eligible_hhs$indicator=(ifelse(eligible_hhs$POVPIP <= 138, "at or below 138% of the federal poverty level", "above 138% of the federal poverty level"))
# 
# 
# #SOURCE survey return function
# source("ask_functions.R")
# total <- survey_return(eligible_hhs)
# 
# 
# # select burdened and not NA
# d_long <- total %>% filter(indicator == "at or below 138% of the federal poverty level" & !is.na(geoid))
# 
# # make data frame
# d_long <- as.data.frame(d_long)
# 
# 
# 
# 
# #### Step 4: Repeat steps 2 and 3 above without the 0-5 filter ####
# 
# ## Select LA County people
# eligible_hhs2 <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
#   
#   # join their housing info  
#   left_join(housing %>% filter(grepl('037', PUMA)), 
#             by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #select distinct households
#   distinct(SERIALNO, .keep_all = TRUE)
# 
# # add geoid and indicator
# eligible_hhs2$geoid <- "037"
# eligible_hhs2$indicator=(ifelse(eligible_hhs2$POVPIP <= 138, "at or below 138% of the federal poverty level", "above 138% of the federal poverty level"))
# 
# 
# #run survey return function
# total <- survey_return(eligible_hhs2)
# 
# # select burdened and not NA
# d_long2 <- total %>% filter(indicator == "at or below 138% of the federal poverty level" & !is.na(geoid))
# 
# # make data frame
# d_long2 <- as.data.frame(d_long2)
# 
# # bind both data frames in final
# d_final <- rbind(d_long, d_long2)
# 
# # #write to one bigger csv with all data years
# write.csv(d_final, file = "CSVs/poverty_county_2022_138pct.csv")

Poverty by SPA 2023: 100% FPL

# Households with children under 5 with incomes at or below the federal poverty level, by SPA,
# as a percentage of households with children under 5 in each SPA, 2023
 
# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2023.pdf

# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# library(sf)
# 
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1: load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# people <- fread(paste0(root, "CA_2023/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2023/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA")))
# 
# 
# ####  Step 2: filter households eligible for calculation  #### 
# 
# ## Select LA County people
# eligible_hhs <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
# 
#     # join their housing info  
#     left_join(housing %>% filter(grepl('037', PUMA)), 
#               by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for children under 5 (AGEP < 5) and select distinct households
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# #### Step 3: get PUMA SPA xwalk and join ####
# 
# #get puma-spa xwalk (from postgres copy)
# conn <- connect_to_db("f5la_v2")
# puma_spa_xwalk <- st_read(conn, query = "SELECT * FROM puma_spa_xwalk_2022")
# dbDisconnect(conn)
# 
# # join
# hh <- eligible_hhs %>% left_join(puma_spa_xwalk, by = "PUMA", relationship = "many-to-many")
# 
# 
# #### Step 4: run survey function and format ####
# 
# # prep data and define indicator
# hh$geoid <- hh$SPA
# hh<-hh%>%
#   mutate(indicator=(ifelse(hh$POVPIP <= 100, "at or below poverty", "above poverty")))
# 
# source("ask_functions.R")
# total <- survey_return(hh)
# 
# # select burdened and not NA
# d_long <- total %>% filter(indicator == "at or below poverty" & !is.na(geoid))
# 
# # make data frame
# d_long <- as.data.frame(d_long)
# 
# # #write to one bigger csv with all data years
# write.csv(d_long, file = "CSVs/poverty_spa_2023.csv")
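The crosswalk join above uses `relationship = "many-to-many"` because a PUMA that clears the area threshold in more than one SPA appears more than once in the crosswalk. A toy version with hypothetical PUMA codes and SPA assignments shows the shape of the operation:

```r
library(dplyr)

# Toy household-to-SPA join (PUMA codes and SPA pairings are made up).
# PUMA 03750 clears the threshold in two SPAs, so it has two xwalk rows
# and its households get one row per SPA after the join.
hh    <- data.frame(SERIALNO = c("A", "B", "C"),
                    PUMA     = c("03701", "03701", "03750"))
xwalk <- data.frame(PUMA = c("03701", "03750", "03750"),
                    SPA  = c("2", "7", "8"))

hh_spa <- hh %>% left_join(xwalk, by = "PUMA", relationship = "many-to-many")
# household C is counted in both SPA 7 and SPA 8
```

Declaring the relationship explicitly silences dplyr's many-to-many warning and documents that the duplication is intentional.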

Poverty by Race 2023: 100% FPL

# Households with children under 5 in poverty by race
# among all eligible LA County households with children under 5, 2023
# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2019-2023.pdf
# 
# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1 load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# people <- fread(paste0(root, "CA_2019_2023/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2019_2023/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA")))
# 
# 
# #### Step 2 filter for LA & Recode racial-ethnic groups ####
# 
# people <- people %>% filter(grepl('037', PUMA)) 
# 
# #SOURCE recode function
# source("ask_functions.R")
# people_recoded <- race_recode(people)
# 
# 
# ####  Step 3: filter LA County households eligible for poverty calculation  #### 
# 
# ## Select LA County households with income-to-poverty ratios (universe)
# 
# eligible_hhs <- people_recoded %>%
#   
#   #filtering for poverty eligible and LA county  
#   filter(!is.na(POVPIP) & (grepl('037', PUMA))) %>%
#   
#   #join households         
#   left_join(housing, by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for children under 5 (AGEP < 5) and select distinct households
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# #### Step 4: Set up surveys and calculate percentages by race/ethnicity
# 
# # survey design code
# 
# # Define weight variable and population base which will be used in the survey design set up
# ## You must use WGTP (if you are using psam_h06.csv and want housing units, like for Low Quality Housing) or PWGTP (if you want person units, like for Connected under5)
# weight <- 'WGTP' # using WGTP b/c calculating percentage of households (housing units) in poverty
# 
# repwlist = rep(paste0("WGTP", 1:80))
# 
# # prep data and identify/calculate indicator
# hh <- eligible_hhs
# hh$geoid <- "037"
# 
# hh<-hh%>%
#   mutate(indicator=(ifelse(hh$POVPIP <= 100, "at or below poverty", "above poverty")))
# 
# # create survey design
# 
# hh_county <- hh %>%               
#   as_survey_rep(
#     variables = c(geoid, indicator, race, latino, aian, nhpi, swana),   # dplyr::select grouping variables
#     weights = weight,                       #  weight
#     repweights = repwlist,                  # list of replicate weights
#     combined_weights = TRUE,                # tells the function that replicate weights are included in the data
#     mse = TRUE,                             # tells the function to calc mse
#     type="other",                           # statistical method
#     scale=4/80,                             # scaling set by ACS
#     rscale=rep(1,80)                        # setting specific to ACS-scaling
#   )
# 
# 
# ###### Latino ######
# lat <- hh_county  %>%
#   group_by(geoid,latino,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,latino) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### NHPI ######
# nhpi <- hh_county  %>%
#   group_by(geoid,nhpi,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,nhpi) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### AIAN ######
# aian <- hh_county  %>%
#   group_by(geoid,aian,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,aian) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### SWANA ######
# swana <- hh_county  %>%
#   group_by(geoid,swana,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,swana) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### RACE ######
# race <- hh_county  %>%
#   group_by(geoid,race,indicator) %>%   # group by race cat
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid,race) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ###### TOTAL ######
# total <- hh_county  %>%
#   group_by(geoid,indicator) %>%   # group by geo and indicator only (no race cat for the total)
#   summarise(
#     num = survey_total(na.rm=T), # get the (survey weighted) count for the numerators
#     rate = survey_mean()) %>%        # get the (survey weighted) proportion for the numerator
#   left_join(hh_county %>%                                        # left join in the denominators
#               group_by(geoid) %>%                                     # group by geo
#               summarise(pop = survey_total(na.rm=T))) %>%              # get the weighted total for overall geo
#   mutate(rate=rate*100,
#          rate_moe = rate_se*1.645*100,    # calculate the margin of error for the rate based on se
#          rate_cv = ((rate_moe/1.645)/rate) * 100, # calculate cv for rate
#          count_moe = num_se*1.645, # calculate moe for numerator count based on se
#          count_cv = ((count_moe/1.645)/num) * 100)  # calculate cv for numerator count
# 
# 
# ####  Step 5: format  ####
# 
# # rename race name columns as subgroup
# total$subgroup = "Total"
# total <- total %>% select(geoid, subgroup, everything())
# 
# aian <- aian %>% rename(subgroup = aian)
# lat <- lat %>% rename(subgroup = latino)
# nhpi <- nhpi %>% rename(subgroup = nhpi)
# race <- race %>% rename(subgroup = race)
# swana <- swana %>% rename(subgroup = swana)
# 
# # merge tables except for bipoc - need total
# d_long <- rbind(total, aian, lat, race, nhpi, swana) %>%
#   filter(indicator == "at or below poverty" & 
#            subgroup != "Not AIAN" &
#            subgroup != "Not Latinx" &
#            subgroup != "Not NHPI" &
#            subgroup != "Not SWANA" &
#            subgroup != "AIAN placeholder" &
#            subgroup != "NHPI placeholder" &
#            subgroup != "Latinx placeholder")
# 
# d_long <- as.data.frame(d_long)
# d_long$geoid <- "06037"
# d_long <- d_long %>% select(geoid, everything())
# 
# # #write to one bigger csv with all data years
# write.csv(d_long, file = "CSVs/poverty_race_2019_23.csv")

Poverty by County 2023: 100% FPL

# Households with children under 5 in poverty
# as a percentage of all eligible LA County households with children under 5, 2023

# Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2023.pdf

# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# 
# #### Step 1: load the data ####
# 
# # PUMS Data (previously downloaded from https://www2.census.gov/programs-surveys/acs/data/pums/)
# root <- "W:/Data/Demographics/PUMS/"
# 
# # Load the people PUMS data
# people <- fread(paste0(root, "CA_2023/psam_p06.csv"), header = TRUE, data.table = FALSE,
#                 colClasses = list(character = c("PUMA", "ANC1P", "ANC2P", "HISP", "RAC1P", "RAC2P", "RAC3P", "RACAIAN", "RACPI", "RACNH")))
# 
# 
# # Load the housing PUMS data
# housing <- fread(paste0(root, "CA_2023/psam_h06.csv"), header = TRUE, data.table = FALSE,
#                  colClasses = list(character = c("PUMA")))
# 
# 
# ####  Step 2: filter households eligible for calculation  #### 
# 
# ## Select LA County people
# eligible_hhs <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
#   
#   # join their housing info  
#   left_join(housing %>% filter(grepl('037', PUMA)), 
#             by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #filter for children under 5 (AGEP < 5) and select distinct households
#   filter(AGEP < 5) %>% distinct(SERIALNO, .keep_all = TRUE)
# 
# 
# ####  Step 3: set up and run survey and format  #### 
# 
# # add geoid and indicator
# eligible_hhs$geoid <- "037"
# eligible_hhs$indicator=(ifelse(eligible_hhs$POVPIP <= 100, "at or below poverty", "above poverty"))
# 
# #SOURCE survey return function
# source("ask_functions.R")
# total <- survey_return(eligible_hhs)
# 
# 
# # select burdened and not NA
# d_long <- total %>% filter(indicator == "at or below poverty" & !is.na(geoid))
# 
# # make data frame
# d_long <- as.data.frame(d_long)
# 
# 
# #### Step 4: Repeat steps 2 and 3 above without the 0-5 filter ####
# 
# ## Select LA County people
# eligible_hhs2 <- people %>% 
#   
#   #filtering for universe and LA county
#   filter(!is.na(POVPIP) & grepl('037', PUMA))   %>% 
#   
#   # join their housing info  
#   left_join(housing %>% filter(grepl('037', PUMA)), 
#             by = c("SERIALNO", "PUMA")) %>%
#   
#   #remove records with no weights
#   filter(!is.na(WGTP)) %>%
#   
#   #select distinct households
#   distinct(SERIALNO, .keep_all = TRUE)
# 
# # add geoid and indicator
# eligible_hhs2$geoid <- "037"
# eligible_hhs2$indicator=(ifelse(eligible_hhs2$POVPIP <= 100, "at or below poverty", "above poverty"))
# 
# 
# #run survey return function
# total <- survey_return(eligible_hhs2)
# 
# # select burdened and not NA
# d_long2 <- total %>% filter(indicator == "at or below poverty" & !is.na(geoid))
# 
# # make data frame
# d_long2 <- as.data.frame(d_long2)
# 
# # bind both data frames in final
# d_final <- rbind(d_long, d_long2)
# 
# # #write to one bigger csv with all data years
# write.csv(d_final, file = "CSVs/poverty_county_2023.csv")

Public Use Microdata Areas (PUMAs) to SPA Crosswalk code

Below is the code used to match PUMAs to SPAs:

# library(tidyverse)
# library(data.table)
# library(readxl)
# library(tidycensus)
# library(srvyr)
# library(stringr)
# library(sf)
# library(tigris)
# options(scipen=999)
# 
# #SOURCE from the script that has: styling, packages, dbconnection, colors
# source("W:\\RDA Team\\R\\credentials_source.R")
# 
# # import SPA shapefile from https://egis-lacounty.hub.arcgis.com/datasets/lacounty::service-planning-areas-2022-view/about 
# # previously downloaded to your working directory
# spas <- st_read("Shapes/Service_Planning_Areas__2022_.shp")
# 
# # convert to same crs
# spas <- st_transform(spas, crs = 2229)
# 
# # get pumas, filter for LA County PUMAs and convert to same crs
# pumas <- pumas(state = "06", year = 2022)
# state <- states(year = 2022, cb = TRUE) %>% filter(GEOID == '06') # pull in CA CBF shape
# cb_pumas <- st_intersection(pumas, state) %>% filter(grepl("06037", GEOID20)) # create LA County CBF PUMAs by clipping CBF state shape and filtering geoids
# cb_pumas <- st_transform(cb_pumas, crs = 2229) # reproject CBF PUMAs
# 
# 
# # calculate area of spas and pumas
# spas$spas_area <- st_area(spas)
# 
# cb_pumas$cb_pumas_area <- st_area(cb_pumas) # Calc CBF PUMA area
# 
# # intersect spas and pumas
# intersects <- st_intersection(cb_pumas, spas)
# 
# # calculate area of the intersect
# intersects$intersect_area <- st_area(intersects)
# 
# # calculate percent of intersect out of spa and puma areas
# intersects$prc_spas_area <- as.numeric(intersects$intersect_area/intersects$spas_area)
# 
# intersects$prc_pumas_area <- as.numeric(intersects$intersect_area/intersects$cb_pumas_area)
# 
# # xwalk with twenty percent threshold
# puma_spa_xwalk <- intersects %>% filter(prc_pumas_area>=.20) %>%
#   
#   # select fields we need
#   select(PUMACE20, NAMELSAD20, SPA, SPA_NAME, prc_spas_area, prc_pumas_area) %>%
#   
#   # rename PUMA
#   rename(PUMA = PUMACE20) %>%
#   
#   # keep only unique rows -- needed when using CBF PUMAs
#   unique()
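A quick sanity check on the 20 percent threshold, using hypothetical overlap shares: an 85/15 split (illustrative of the PUMA 03709 / La Canada Flintridge example, where the vast majority falls in SPA 3) drops the sliver, while a genuine 60/40 split keeps both SPA rows. The 03760 pairing below is made up for illustration.

```r
library(dplyr)

# Hypothetical intersect shares for two PUMAs. The 15% sliver of
# 03709 in SPA 2 is dropped by the >= 20% filter; the 60/40 split
# of 03760 keeps both SPA rows.
toy <- data.frame(PUMA = c("03709", "03709", "03760", "03760"),
                  SPA  = c("3", "2", "6", "8"),
                  prc_pumas_area = c(0.85, 0.15, 0.60, 0.40))

toy %>% filter(prc_pumas_area >= 0.20)
# keeps 03709/SPA 3 and both rows of 03760
```

This is the trade-off described above: a cheap area-share rule attributes whole PUMAs to SPAs instead of splitting households fractionally, preserving statistical stability.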